AITopics | Springfield

Springfield

Monet: Mixture of Monosemantic Experts for Transformers

Park, Jungwoo, Ahn, Young Jin, Kim, Kee-Eung, Kang, Jaewoo

arXiv.org Artificial Intelligence, Dec-9-2024

Understanding the internal computations of large language models (LLMs) is crucial for aligning them with human values and preventing undesirable behaviors like toxic content generation. However, mechanistic interpretability is hindered by polysemanticity -- where individual neurons respond to multiple, unrelated concepts. While Sparse Autoencoders (SAEs) have attempted to disentangle these features through sparse dictionary learning, they have compromised LLM performance due to reliance on post-hoc reconstruction loss. To address this issue, we introduce Mixture of Monosemantic Experts for Transformers (Monet) architecture, which incorporates sparse dictionary learning directly into end-to-end Mixture-of-Experts pretraining. Our novel expert decomposition method enables scaling the expert count to 262,144 per layer while total parameters scale proportionally to the square root of the number of experts. Our analyses demonstrate mutual exclusivity of knowledge across experts and showcase the parametric knowledge encapsulated within individual experts. Moreover, Monet allows knowledge manipulation over domains, languages, and toxicity mitigation without degrading general performance. Our pursuit of transparent LLMs highlights the potential of scaling expert counts to enhance mechanistic interpretability and directly resect the internal knowledge to fundamentally adjust model behavior. The source code and pretrained checkpoints are available at https://github.com/dmis-lab/Monet.
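The square-root scaling claim in the abstract can be checked with a little arithmetic. The sketch below is only an illustration of that relationship, not the authors' decomposition: the helper name `sqrt_scaling_factor` is hypothetical, and it assumes the expert count is a perfect square.

```python
import math

def sqrt_scaling_factor(num_experts: int) -> int:
    """Return the integer square root of a perfect-square expert count.

    Hypothetical helper: it stands in for "the quantity that total
    parameters are proportional to" under square-root scaling.
    """
    root = math.isqrt(num_experts)
    if root * root != num_experts:
        raise ValueError("this sketch expects a perfect-square expert count")
    return root

EXPERTS_PER_LAYER = 262_144  # figure quoted in the abstract

root = sqrt_scaling_factor(EXPERTS_PER_LAYER)
print(root)  # 512, since 512 * 512 == 262,144

# Under square-root scaling, quadrupling the expert count only doubles
# the quantity that the parameter count is proportional to.
growth = sqrt_scaling_factor(4 * EXPERTS_PER_LAYER) / root
print(growth)  # 2.0
```

Under linear scaling, the same fourfold increase in experts would multiply expert parameters by four rather than two.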

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.04139

Country:

Europe > United Kingdom > England > Staffordshire (0.04)
North America > United States > Florida (0.04)
Oceania > New Zealand (0.04)
(32 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Banking & Finance (1.00)
Government > Regional Government (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Outlier-robust Kalman Filtering through Generalised Bayes

Duran-Martin, Gerardo, Altamirano, Matias, Shestopaloff, Alexander Y., Sánchez-Betancourt, Leandro, Knoblauch, Jeremias, Jones, Matt, Briol, François-Xavier, Murphy, Kevin

arXiv.org Machine Learning, May-28-2024

We derive a novel, provably robust, and closed-form Bayesian update rule for online filtering in state-space models in the presence of outliers and misspecified measurement models. Our method combines generalised Bayesian inference with filtering methods such as the extended and ensemble Kalman filter. We use the former to show robustness and the latter to ensure computational efficiency in the case of nonlinear models. Our method matches or outperforms other robust filtering methods (such as those based on variational Bayes) at a much lower computational cost. We show this empirically on a range of filtering problems with outlier measurements, such as object tracking, state estimation in high-dimensional chaotic systems, and online learning of neural networks.
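To make the idea of an outlier-robust filtering update concrete, here is a minimal scalar sketch. It is not claimed to be the paper's update rule: it assumes a one-dimensional state observed directly, and it assumes (as an illustration only) that robustness is obtained by down-weighting the measurement with an inverse multi-quadric weight, which inflates the effective observation variance for large residuals. The function names and the `scale` parameter are hypothetical.

```python
import math

def imq_weight(residual: float, scale: float) -> float:
    """Inverse multi-quadric weight in (0, 1]; shrinks as the residual grows."""
    return 1.0 / math.sqrt(1.0 + (residual / scale) ** 2)

def kalman_update(mean, var, y, obs_var, weight=1.0):
    """Scalar Kalman measurement update with an optional measurement weight.

    weight == 1.0 gives the standard update; weight < 1.0 inflates the
    effective observation variance, so the measurement is trusted less.
    """
    eff_obs_var = obs_var / weight ** 2
    gain = var / (var + eff_obs_var)
    new_mean = mean + gain * (y - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

# Prior belief and an outlying measurement (all values are made up).
mean, var, obs_var = 0.0, 1.0, 1.0
outlier = 100.0

standard_mean, _ = kalman_update(mean, var, outlier, obs_var)

w = imq_weight(outlier - mean, scale=2.0)
robust_mean, _ = kalman_update(mean, var, outlier, obs_var, weight=w)

# The standard update jumps halfway to the outlier; the weighted one barely moves.
print(standard_mean, robust_mean)
```

The update stays closed-form because the weight only rescales the observation variance, so the cost per step is essentially that of an ordinary Kalman update.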

artificial intelligence, machine learning, outlier-robust kalman filtering, (15 more...)

arXiv.org Machine Learning

2405.05646

Country:

Europe > United Kingdom > England > Greater London > London (0.14)
Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (0.92)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Trainable Loss Weights in Super-Resolution

Mellatshahi, Arash Chaichi, Kasaei, Shohreh

arXiv.org Artificial Intelligence, Nov-27-2023

In recent years, limited research has discussed the loss function in the super-resolution process, and most of those studies have used only perceptual similarity in the conventional way. Yet developing an appropriate loss can also improve the quality of other methods. In this article, a new weighting method for pixel-wise loss is proposed. This method makes it possible to use trainable weights based on the general structure of the image and its perceptual features while maintaining the advantages of pixel-wise loss. A criterion for comparing loss weights is also introduced so that the weights can be estimated directly by a convolutional neural network. The expectation-maximization method is used to estimate the super-resolution network and the weighting network simultaneously. In addition, a new activation function called "FixedSum" is introduced, which keeps the sum of all components of a vector constant while keeping the output components between zero and one. As the experimental results show, the loss weighted by the proposed method leads to better results than both the unweighted loss and the uncertainty-based weighted loss, in terms of both signal-to-noise ratio and perceptual similarity, on state-of-the-art networks. Code is available online.

image super-resolution, proceedings, weighting network, (14 more...)

arXiv.org Artificial Intelligence

2301.10575

Country: North America > United States > Virginia > Fairfax County > Springfield (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Large Language Models are Diverse Role-Players for Summarization Evaluation

Wu, Ning, Gong, Ming, Shou, Linjun, Liang, Shining, Jiang, Daxin

arXiv.org Artificial Intelligence, Sep-19-2023

Text summarization has a wide range of applications in many scenarios. Evaluating the quality of generated text is a complex problem. A major challenge in language evaluation is the clear divergence between existing metrics and human evaluation. A document summary's quality can be assessed by human annotators on various criteria, both objective ones like grammar and correctness, and subjective ones like informativeness, succinctness, and appeal. Most automatic evaluation methods, such as BLEU and ROUGE, may not be able to adequately capture these dimensions. In this paper, we propose a new LLM-based evaluation framework that compares generated text and reference text from both objective and subjective aspects. First, we propose to model the objective and subjective dimensions of generated text with a roleplayer prompting mechanism. Furthermore, we introduce a context-based prompting mechanism that generates dynamic roleplayer profiles based on the input context. Finally, we design a multi-roleplayer prompting technique based on batch prompting and integrate the multiple outputs into the final evaluation results. Experimental results on three real datasets for summarization show that our model is highly competitive and has very high consistency with human annotators.

dimension, evaluation, roleplayer, (13 more...)

arXiv.org Artificial Intelligence

2303.15078

Country:

North America > United States > Nevada (0.05)
North America > United States > Virginia > Fairfax County > Springfield (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.95)
Law (0.95)
Media > News (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Differentiable Rendering for Synthetic Aperture Radar Imagery

Wilmanski, Michael, Tamir, Jonathan

arXiv.org Artificial Intelligence, Aug-7-2023

There is rising interest in differentiable rendering, which allows explicitly modeling geometric priors and constraints in optimization pipelines using first-order methods such as backpropagation. Incorporating such domain knowledge can lead to deep neural networks that are trained more robustly and with limited data, as well as the capability to solve ill-posed inverse problems. Existing efforts in differentiable rendering have focused on imagery from electro-optical sensors, particularly conventional RGB-imagery. In this work, we propose an approach for differentiable rendering of Synthetic Aperture Radar (SAR) imagery, which combines methods from 3D computer graphics with neural rendering. We demonstrate the approach on the inverse graphics problem of 3D Object Reconstruction from limited SAR imagery using high-fidelity simulated SAR data.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2204.01248

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Virginia > Fairfax County > Springfield (0.04)
North America > United States > Texas > Harris County > Houston (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Knowledge Distilled Ensemble Model for sEMG-based Silent Speech Interface

Lai, Wenqiang, Yang, Qihan, Mao, Ye, Sun, Endong, Ye, Jiangnan

arXiv.org Artificial Intelligence, Aug-6-2023

Voice disorders affect millions of people worldwide. [...] Our findings also shed light on an end-to-end system for portable, practical equipment. Normal communication is not always possible. [...] Diseases that lead to language impairments include brain injuries (e.g., aphasia, apraxia, and dysarthria) and voice disorders [...]. Most recently, deep learning-based methods have thrived [...]. AlterEgo, utilising CNN, proposed a product that did not require users to explicitly mouth their speech with pronounced, apparent facial movements [10]. [...] method to classify the International Radiotelephony Spelling Alphabet with a commercially off-the-shelf (COTS) device.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2308.06533

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
North America > Canada > Quebec (0.05)
Asia > India (0.05)
North America > United States > Virginia > Fairfax County > Springfield (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Education (0.47)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.94)

Acoustic Beamforming for Object-relative Distance Estimation and Control in Unmanned Air Vehicles using Propulsion System Noise

Sharma, Alisha, Geder, Jason, Lingevitch, Joseph, Martin, Theodore, Lofaro, Daniel, Sofge, Donald

arXiv.org Artificial Intelligence, Apr-15-2023

Unmanned air vehicles often produce significant noise from their propulsion systems. Using this broadband signal as "acoustic illumination" for an auxiliary sensing system could make vehicles more robust at a minimal cost. We present an acoustic beamforming-based algorithm that estimates object-relative distance with a small two-microphone array using the generated propulsion system noise of a vehicle. We demonstrate this approach in several closed-loop distance feedback control tests with a mounted quad-rotor vehicle in a noisy environment and show accurate object-relative distance estimates more than 2x further than the baseline channel-based approach. We conclude that this approach is robust to several practical vehicle and noise situations and shows promise for use in more complex operating environments.

algorithm, artificial intelligence, microphone, (12 more...)

arXiv.org Artificial Intelligence

2304.07596

Country:

North America > United States > Oregon > Multnomah County > Portland (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Virginia > Fairfax County > Springfield (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry:

Aerospace & Defense (0.70)
Energy (0.52)
Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (0.47)

Pentagon goes on AI hiring spree to bring machine learning capabilities to the battlefield

FOX News, Apr-13-2023, 14:15:16 GMT

The Pentagon is hiring data scientists, technologists and engineers as part of its effort to incorporate artificial intelligence into the machinery used to wage war. The Defense Department has posted several AI jobs on USAjobs.gov over the last few weeks, including many with salaries well into six figures. One of the higher paying jobs advertised in the last few weeks is for a senior technologist for "cognitive and decision science" at the U.S. Navy's Point Loma Complex in San Diego. That job starts at $170,000 and could pay as much as $212,000 a year for someone who can help insert "cutting-edge technology" into Navy weaponry and equipment.

artificial intelligence, battlefield, scientist, (15 more...)

FOX News

Country:

North America > United States > California > San Diego County > San Diego (0.26)
North America > United States > Virginia > Fairfax County > Springfield (0.06)
North America > United States > Texas > Travis County > Austin (0.06)
(2 more...)

Industry:

Government > Military (1.00)
Government > Regional Government > North America Government > United States Government (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.32)

Toward Defining a Domain Complexity Measure Across Domains

Doctor, Katarina, Task, Christine, Kildebeck, Eric, Kejriwal, Mayank, Holder, Lawrence, Leong, Russell

arXiv.org Artificial Intelligence, Mar-7-2023

Artificial Intelligence (AI) systems planned for deployment in real-world applications are frequently researched and developed either in closed simulation environments, where all variables are controlled and known to the simulator, or on labeled benchmark datasets. Transitioning from these simulators, testbeds, and benchmark datasets to more open-world domains poses significant challenges to AI systems, including significant increases in the complexity of the domain and the inclusion of real-world novelties; the open-world environment contains numerous out-of-distribution elements that are not part of the AI systems' training set. Here, we propose a path to a general, domain-independent measure of domain complexity level. We distinguish two aspects of domain complexity: intrinsic and extrinsic. The intrinsic domain complexity is the complexity that exists by itself without any action or interaction from an AI agent performing a task on that domain. This is an agent-independent aspect of the domain complexity. The extrinsic domain complexity is agent- and task-dependent. Intrinsic and extrinsic elements combined capture the overall complexity of the domain. We frame the components that define and impact domain complexity levels in a domain-independent light. Domain-independent measures of complexity could enable quantitative predictions of the difficulty posed to AI systems when transitioning from one testbed or environment to another, when facing out-of-distribution data in open-world tasks, and when navigating the rapidly expanding solution and search spaces encountered in open-world domains.

artificial intelligence, complexity, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2303.04141

Country:

North America > United States > California (0.14)
North America > United States > Washington (0.04)
North America > United States > Virginia > Fairfax County > Springfield (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Government (0.93)
Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.88)

Data Scientist at Novetta - Springfield, Virginia

#artificialintelligence, Feb-15-2023, 00:10:29 GMT

Accenture Federal Services delivers a range of innovative, tech-enabled services for the U.S. Federal Government to address the complex, sensitive challenges of national security and intelligence missions. Refer a qualified candidate and earn up to $20K. Accenture Federal Services is seeking a Data Scientist to analyze, design, code and test multiple components of application code across one or more clients. Compensation for roles at Accenture Federal Services varies depending on a wide array of factors including but not limited to the specific office location, role, skill set and level of experience. As required by local law, Accenture Federal Services provides a reasonable range of compensation for roles that may be hired in California, Colorado, New York City or Washington as set forth below and information on benefits offered is here.

accenture federal service, employment opportunity, information, (13 more...)

#artificialintelligence

Country:

North America > United States > Virginia > Fairfax County > Springfield (0.40)
North America > United States > New York (0.27)
North America > United States > Colorado (0.27)
North America > United States > California (0.27)

Industry: Government (1.00)

Technology:

Information Technology > Data Science (0.61)
Information Technology > Artificial Intelligence (0.40)
